CASE-QA: Context and Syntax embeddings for Question Answering On Stack Overflow

Authors

Ezra Winston

Graham Neubig

William Cohen

Abstract

Question answering (QA) systems rely on both knowledge bases and unstructured text corpora. Domain-specific QA presents a unique challenge, since relevant knowledge bases are often lacking and unstructured text is difficult to query and parse. This project focuses on the QUASAR-S dataset (Dhingra et al., 2017), constructed from the community QA site Stack Overflow. QUASAR-S consists of Cloze-style questions about software entities and a large background corpus of community-generated posts, each tagged with relevant software entities. We incorporate the tag entities as context for the QA task and find that modeling co-occurrence of tags and answers in posts leads to significant accuracy gains. To this end, we propose CASE, a hybrid of an RNN language model and a tag-answer co-occurrence model, which achieves state-of-the-art accuracy on the QUASAR-S dataset. We also find that this approach, modeling both question sentences and context-answer co-occurrence, is effective for other QA tasks. Using only language and co-occurrence modeling on the training set, CASE is competitive with the state-of-the-art method on the SPADES dataset (Bisk et al., 2016), even though that method uses a knowledge base.
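
As a rough illustration of the tag-answer co-occurrence idea described in the abstract, the following Python sketch counts how often candidate answer entities appear in posts carrying each tag, then blends a smoothed co-occurrence score with a language-model score. It is not the authors' CASE model: the abstract does not say how the two components are combined, so the linear interpolation, the add-alpha smoothing, the function names, the toy posts, and the fixed numbers standing in for RNN language-model log-probabilities are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def build_cooccurrence(posts):
    """Count, per tag, how many posts with that tag mention each entity."""
    counts = defaultdict(Counter)
    for tags, entities in posts:
        for tag in tags:
            counts[tag].update(set(entities))
    return counts


def cooccurrence_score(counts, tags, answer, vocab_size, alpha=1.0):
    """Sum of add-alpha smoothed log P(answer | tag) over the question's tags."""
    score = 0.0
    for tag in tags:
        tag_counts = counts.get(tag, Counter())
        total = sum(tag_counts.values())
        score += math.log((tag_counts[answer] + alpha) / (total + alpha * vocab_size))
    return score


def rank_answers(counts, tags, lm_log_probs, weight=0.5):
    """Rank candidates by an assumed linear blend of the two scores."""
    vocab_size = len(lm_log_probs)
    combined = {
        answer: (1 - weight) * lm_lp
        + weight * cooccurrence_score(counts, tags, answer, vocab_size)
        for answer, lm_lp in lm_log_probs.items()
    }
    return sorted(combined, key=combined.get, reverse=True)


# Toy background corpus: (tags on the post, software entities it mentions).
posts = [
    (["python", "web"], ["django", "flask"]),
    (["python"], ["numpy"]),
    (["java", "web"], ["spring"]),
    (["python", "web"], ["django"]),
]
counts = build_cooccurrence(posts)

# Placeholder scores standing in for an RNN language model over the question.
lm_log_probs = {
    "django": math.log(0.3),
    "spring": math.log(0.4),
    "numpy": math.log(0.3),
}
ranking = rank_answers(counts, ["python", "web"], lm_log_probs)
print(ranking)
```

In this toy setup the placeholder language model alone slightly prefers "spring", while the tag context ("python", "web") moves "django" to the top, which is the kind of effect the abstract attributes to using tags as context.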

Similar resources

Assessing the Performance of Question-and-Answer Communities Using Survival Analysis

Question-&-Answer (QA) websites have emerged as efficient platforms for knowledge sharing and problem solving. In particular, the Stack Exchange platform includes some of the most popular QA communities to date, such as Stack Overflow. Initial metrics used to assess the performance of these communities include summative statistics like the percentage of resolved questions or the average time to...

Creating Causal Embeddings for Question Answering with Minimal Supervision

A common model for question answering (QA) is that a good answer is one that is closely related to the question, where relatedness is often determined using general-purpose lexical models such as word embeddings. We argue that a better approach is to look for answers that are related to the question in a relevant way, according to the information need of the question, which may be determined thr...

Answering Live Questions from Heterogeneous Data Sources SMART in Live QA at TREC 2016

A significant portion of information is today available in a digital format. However, users still face difficulties in accessing it. A big portion of the challenge consists in designing efficient approaches for reasoning over heterogeneous data sources. In this paper, we describe the participation of the Semantic Search and Question Answering group (SMART) in Live QA track at TREC 2016. SMART s...

Detecting Duplicate Posts in Programming QA Communities via Latent Semantics and Association Rules

Programming community-based question-answering (PCQA) websites such as Stack Overflow enable programmers to find working solutions to their questions. Despite detailed posting guidelines, duplicate questions that have been answered are frequently created. To tackle this problem, Stack Overflow provides a mechanism for reputable users to manually mark duplicate questions. This is a laborious eff...

Investigating Embedded Question Reuse in Question Answering

The investigation presented in this paper is a novel method in question answering (QA) that enables a QA system to gain performance by reusing information in the answer to one question to answer another related question. Our analysis shows that a pair of questions in general open-domain QA can have an embedding relation through their mentions of noun phrase expressions. We present methods f...

Publication date: 2017
